Facial emotion recognition by machine is a challenging task. For decades,
researchers have applied different methods to classify facial emotions into
different classes. The expansion of artificial intelligence in the form of deep
convolutional neural networks (CNN) changed the direction of the research.
Facial emotion recognition using deep CNN is powerful in that it can take
bulk input images for processing and classify them with high accuracy. It has
been noticed that in a few cases the classification model does not assign facial
images to the appropriate classes because of the influence of noise. It is
therefore highly recommended to supply a noise-free image to the facial emotion
recognition model for classification. We adopted a mechanism and proposed a
model for classifying a facial image into one of seven classes with high
accuracy. The images are smoothed by different smoothing processes, as part of
image preprocessing, before being applied to the model. We claim that facial
emotion recognition with image smoothing by different filters, or by a mixture
of filters, is more robust than recognition without preprocessing. The details
are explained in the subsequent sections.

1. INTRODUCTION 

An image is a set of pixels represented by a function $f(x, y)$, where $x \in \text{domain}$ ($x$-axis) and
$y \in \text{range}$ ($y$-axis), whose scalar value is equivalent to the amount of energy radiated
from the place where the image is taken. Suppose $f(a, b)$ designates an image of continuous variables that is
converted into a digital image $f(x, y)$, where $x \in \{0, 1, 2, \ldots, M-1\}$ and $y \in \{0, 1, 2, \ldots, N-1\}$. Here
$M$ and $N$ are the length and breadth of the digital image. The matrix representation of this image
definition is given in (1): 


\[
f(x, y) =
\begin{pmatrix}
f(0, 0) & \cdots & f(0, M-1) \\
\vdots & \ddots & \vdots \\
f(N-1, 0) & \cdots & f(N-1, M-1)
\end{pmatrix}
\tag{1}
\]


Here, each $f(x_i, y_j)$ corresponds spatially to a pixel $(x_i, y_j)$, for every pixel with
$0 \le i < M$ and $0 \le j < N$. The four neighboring pixels of any pixel $(x_i, y_j)$ are
$(x_{i+1}, y_j)$, $(x_{i-1}, y_j)$, $(x_i, y_{j+1})$, and $(x_i, y_{j-1})$. Often, the image cannot be analyzed in a true sense because of its bad
quality and the amount of noise present [1]. The corrupted image is presented as (2): 


\[
g(x, y) = f(x, y) + n(x, y) \tag{2}
\]

where $f(x, y)$ is the noiseless image and $n(x, y)$ is the noise present in the image. 
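
The additive model in (2) can be illustrated with a small sketch. It is not taken from the paper: the function names are hypothetical, the image is a plain list of rows to avoid external dependencies, and the noise is assumed to be zero-mean Gaussian purely for illustration (the text does not fix a noise distribution).

```python
import random


def add_noise(clean, sigma, seed=0):
    """Return g = f + n for an image given as a list of rows.

    Assumption: n is zero-mean Gaussian noise with standard deviation sigma.
    """
    rng = random.Random(seed)
    return [[pixel + rng.gauss(0.0, sigma) for pixel in row] for row in clean]


def noise_component(corrupted, clean):
    """Recover n = g - f when the clean image f is known."""
    return [[g - f for g, f in zip(g_row, f_row)]
            for g_row, f_row in zip(corrupted, clean)]


f = [[10.0, 20.0], [30.0, 40.0]]   # toy noiseless image
g = add_noise(f, sigma=5.0)        # corrupted image
n = noise_component(g, f)          # noise that was added
```

In practice the clean image is unknown, which is why the filters discussed next only estimate it from the corrupted one.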

The presence of noise corrupts the image, partially or regularly, at different portions of the image. As a
result, the knowledge extracted from the image may not be true. Image filtering is used to recover the
quality of the image from the noisy image. According to [2], several filters, such as average, median,
gaussian, and bilateral, are used to smooth the image. In this situation convolution is used: the
operator $\otimes$ is applied to $f(x, y)$ with the impulse response $g(x, y)$ to create the smooth image
$h(x, y)$, as in (3), with its discrete counterpart in (4). 


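\[
h(x, y) = f(x, y) \otimes g(x, y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x', y')\, g(x - x', y - y')\, dx'\, dy' \tag{3}
\]

\[
h[i, j] = f[i, j] \otimes g[i, j] \tag{4}
\]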
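
A minimal sketch of the discrete convolution in (4) is given below. The function name `convolve2d`, the zero padding at the border, and the output having the same size as the input are assumptions made for illustration, not details stated in the paper.

```python
def convolve2d(image, kernel):
    """Discrete 2D convolution h[i, j] = sum_{m, n} f[m, n] * g[i - m, j - n].

    image and kernel are lists of rows; the kernel is taken as centred on its
    middle element. Pixels outside the image are treated as zero (assumption).
    """
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    r_off, c_off = k_rows // 2, k_cols // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for u in range(k_rows):
                for v in range(k_cols):
                    # Kernel offset (u - r_off, v - c_off) pairs with the image
                    # pixel at (i - offset), which is the kernel "flip".
                    m = i + r_off - u
                    n = j + c_off - v
                    if 0 <= m < rows and 0 <= n < cols:
                        acc += image[m][n] * kernel[u][v]
            out[i][j] = acc
    return out


identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
box = [[1.0 / 9.0] * 3 for _ in range(3)]    # 3x3 averaging kernel
sample = [[5.0] * 4 for _ in range(4)]
smoothed = convolve2d(sample, box)
```

With the identity kernel the image is returned unchanged, while the box kernel replaces each interior pixel by the mean of its 3×3 neighbourhood.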


The human face conveys sensible information that changes from time to time [3] under
external or internal influence. In this article we demonstrate a facial emotion recognition model built by
applying artificial intelligence. The input to this model is filtered by different filters as a part of image
preprocessing, which leads to higher accuracy than without smoothing. The study of facial emotion
recognition goes back to Darwin; according to [4], there are 40 expression curves that a human face poses
after perceiving inputs from the environment. The action units [5], [6] of the face are the fundamental units
of expression and contain the sensitive information of an expression. The convolutional neural network
(CNN), which consists of convolutional layers, pooling layers and a fully connected network [7], is a most
interesting tool and technology that produces promising results [8] for high-level scientific computation [9], [10].
Convolutional neural networks are used not only for facial emotion recognition, as in the research
described here, but also for several other classification tasks such as human disease classification [11], [12] and
plant disease classification [13]. Before deep CNN became popular, image classification used different
machine learning algorithms and methods in applications such as brain tumor [14], [15], plant
disease [16], [17] and others [18], [19]. 

We have adopted a deep CNN in our research. The input to the architecture is a preprocessed facial
image that is filtered by various filters [20]; as a result, the quality of the image is enhanced. Filters
smooth the image by removing impulse noise in different measures, depending on the function each one uses. The
convolutional neural network accepts the smoothed images and trains an artificial intelligence model for facial
emotion recognition whose output is one of happy, sad, fear, disgust, neutral, surprise or angry. In general,
the complexity of a model increases and its accuracy decreases as the number of classes increases, which makes
the task more challenging. We claim our model holds up well for a wide variety of emotion classification with high
accuracy. 

The primary input to a facial emotion recognition model is an image. The training of the model is
influenced by the amount of noise in the images. It is believed that a smoothed image gives more robust
recognition than an unsmoothed one. The filters that smooth images are average, median, gaussian and bilateral; each filter
has its own pros and cons. However, most of them cannot recover a heavily corrupted image
with noise density above 70% well enough to preserve the detailed information of the image [21]. The median filter and
its variants are extensively used [22] to reduce impulse noise in grayscale images, and the
performance is increased. Averaging the pixel intensities with respect to the size of the filter is a common
method for smoothing the image, but fuzzy averaging [23] reduces impulses to a large extent. Identifying the
pixels that belong to the borders, then applying a reduced smoothing to them and a more intense smoothing to
the remaining pixels, produced a standard result [24] in an ultrasound image application. 

Median filtering is a good choice for noise reduction. An improved median filtering algorithm
[25] uses the correlation of the image to process the features of the filtering mask over the image. Median
filtering based on the combined features of different images, which consist of joint conditional probability density
functions with principal component analysis used to reduce the dimension, is performed on uncompressed
image datasets. A newly proposed method [26] uses a median filter with prior information to capture natural
pixels for restoration; this method restores images corrupted with a 99% level of salt-and-pepper impulse
noise. Switching between the median and mean [27] by detecting a filter is a proven method of smoothing. 

The gaussian function used for gaussian blur [28] is a kind of normal distribution. The original pixel
receives the maximum gaussian weight, and the neighboring pixels receive proportionally lower
gaussian weights. The review article [29] is a good collection of gaussian filters used in
different applications and explains the advantages of this filter with respect to others. 

Smoothing that reduces noise while preserving edge information [30] is achieved using the
bilateral filter [31]. Here the intensity of each pixel is substituted by a weighted average of the
intensities of nearby pixels. A framework for image denoising [32] and the suppression of mixed
noise in color images [33] are a few of the advanced examples that use the bilateral filter. The remainder of the
paper is organized as follows: section 2 presents the research method, section 3 the results and discussion, and
section 4 the conclusion. 


2. RESEARCH METHOD 
2.1. Dataset description 

The renowned FER2013 and CK48+ datasets are used for experimentation with the
proposed model. The CK48+ and FER2013 datasets consist of 3540 and 35887 images, respectively, related to
seven different facial expressions: happy, angry, sad, surprise, neutral, disgust, and fear. All the images are
normalized and standardized, and all are resized
to a fixed dimension of 48×48 to maintain uniformity. 
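
The preprocessing described above can be sketched as follows. The paper does not state which resizing, normalization or standardization techniques were used, so nearest-neighbour resizing, scaling by 255 and per-image zero-mean/unit-variance standardization are assumptions, and all function names are hypothetical.

```python
def resize_nearest(image, height=48, width=48):
    """Resize a grayscale image (list of rows) by nearest-neighbour sampling."""
    src_h, src_w = len(image), len(image[0])
    return [[image[r * src_h // height][c * src_w // width]
             for c in range(width)]
            for r in range(height)]


def normalize(image, max_value=255.0):
    """Scale pixel intensities from [0, max_value] to [0, 1]."""
    return [[pixel / max_value for pixel in row] for row in image]


def standardize(image):
    """Shift and scale one image to zero mean and unit variance."""
    pixels = [pixel for row in image for pixel in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    std = variance ** 0.5 or 1.0   # a flat image would give std == 0
    return [[(pixel - mean) / std for pixel in row] for row in image]


def preprocess(image):
    """Resize to 48x48, then normalize and standardize."""
    return standardize(normalize(resize_nearest(image)))


raw = [[(r + c) % 256 for c in range(96)] for r in range(96)]   # toy 96x96 image
ready = preprocess(raw)
```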


2.2. Filter description 

The basic focus of our research is to observe facial emotion classification and the accuracy
it achieves for smoothed input images. The images undergo different smoothing processes and
the observations are tabulated in the experimental section. For smoothing the images, a hybrid smoothing filter is
proposed, which is formed by the combination of the average, median, gaussian and bilateral filters, and their
performances are compared. The equations used by each of the filters are: average filtering
in (5), median filtering in (6), gaussian in (7) for 1D and in (8) for 2D, and bilateral in (9). 


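\[
Img(x, y) = \sum_{j=-1}^{1} \sum_{i=-1}^{1} 1 \cdot Img_{act}(x + i, y + j), \qquad
Img_{norm}(x, y) = \frac{1}{9}\, Img(x, y) \tag{5}
\]

\[
Img(x, y) = \operatorname{median}\{\, Img_{act}(x + i, y + j) \mid (i, j) \in R \,\} \tag{6}
\]

\[
G(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}\, e^{-\frac{x^{2}}{2\sigma^{2}}} \tag{7}
\]

\[
G(x, y) = \frac{1}{2\pi\sigma^{2}}\, e^{-\frac{x^{2} + y^{2}}{2\sigma^{2}}} \tag{8}
\]

\[
Img_{Filtered}(x) = \frac{1}{W_p} \sum_{x_i \in \Omega} Img(x_i)\, f_{range}(\lvert Img(x_i) - Img(x) \rvert)\, G_s(\lvert x_i - x \rvert), \qquad
W_p = \sum_{x_i \in \Omega} f_{range}(\lvert Img(x_i) - Img(x) \rvert)\, G_s(\lvert x_i - x \rvert) \tag{9}
\]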
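
The four filters can be sketched in a few lines of dependency-free Python. This is an illustrative sketch rather than the authors' implementation: the 3×3 window, the border handling (clamping for the average and median filters, skipping out-of-image pixels for the bilateral filter), the Gaussian form of the range function and the values of `sigma_s` and `sigma_r` are all assumptions.

```python
import math
from statistics import median


def window3x3(image, x, y):
    """Return the 3x3 neighbourhood of (x, y), clamping indices at the border."""
    rows, cols = len(image), len(image[0])
    return [image[min(max(x + i, 0), rows - 1)][min(max(y + j, 0), cols - 1)]
            for i in (-1, 0, 1) for j in (-1, 0, 1)]


def average_filter(image):
    """Replace each pixel by the mean of its 3x3 window."""
    return [[sum(window3x3(image, x, y)) / 9.0 for y in range(len(image[0]))]
            for x in range(len(image))]


def median_filter(image):
    """Replace each pixel by the median of its 3x3 window."""
    return [[median(window3x3(image, x, y)) for y in range(len(image[0]))]
            for x in range(len(image))]


def gaussian_kernel(size=3, sigma=1.0):
    """Sample the 2D gaussian on a size x size grid and normalize it to sum to 1."""
    half = size // 2
    raw = [[math.exp(-(x * x + y * y) / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
            for y in range(-half, half + 1)] for x in range(-half, half + 1)]
    total = sum(map(sum, raw))
    return [[value / total for value in row] for row in raw]


def bilateral_filter(image, sigma_s=1.0, sigma_r=25.0):
    """Weighted 3x3 average with a spatial and an intensity (range) gaussian."""
    rows, cols = len(image), len(image[0])
    out = []
    for x in range(rows):
        out_row = []
        for y in range(cols):
            num = den = 0.0
            for i in (-1, 0, 1):
                for j in (-1, 0, 1):
                    xi, yj = x + i, y + j
                    if not (0 <= xi < rows and 0 <= yj < cols):
                        continue   # pixels outside the image are skipped
                    diff = image[xi][yj] - image[x][y]
                    weight = (math.exp(-(i * i + j * j) / (2 * sigma_s ** 2))
                              * math.exp(-(diff * diff) / (2 * sigma_r ** 2)))
                    num += weight * image[xi][yj]
                    den += weight
            out_row.append(num / den)   # den is W_p, at least 1 (centre pixel)
        out.append(out_row)
    return out


impulse = [[10] * 5 for _ in range(5)]
impulse[2][2] = 255                      # a single "salt" pixel
cleaned = median_filter(impulse)
blurred = average_filter(impulse)
```

On this toy image the median filter removes the isolated impulse completely, whereas the average filter only spreads it over the neighbourhood; the bilateral filter leaves a sharp step edge almost untouched because the range weight across the edge is close to zero.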


2.3. Model description 

In the devised model, a facial emotion recognition image dataset is taken and converted into a
hybrid image set by applying various smoothing techniques. 
Step 1: Initially, n random images are selected from the image set by using the randSelect function proposed in
the algorithm. 
Step 2: Average filtering is applied to the randomly selected images and the resulting images are stored in
the hybrid image set; the selected images are removed from the original image set. 
Step 3: The same process is repeated using the median, gaussian, and bilateral filters, so that a hybrid image set
is formed from the differently filtered images. 
Step 4: Assign labels to the resulting hybrid image set. 
Step 5: Divide the hybrid image set in the ratio of 85:15 for training and testing purposes. 
Step 6: Train the proposed CNN model with the images selected for training and evaluate it with the images
selected for testing. 
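
Steps 1 to 4 can be sketched as below. The function name `build_hybrid_set` is hypothetical, the filters are passed in as arbitrary callables, and giving every filter an (almost) equal share of the images is an assumption, since the text does not fix the value of n.

```python
import random


def build_hybrid_set(images, labels, filters, seed=0):
    """Apply each filter to its own random, disjoint share of the dataset.

    images  -- list of images (any objects the filter callables accept)
    labels  -- emotion label of each image, in the same order as images
    filters -- list of callables, e.g. [average, median, gaussian, bilateral]
    Assumption: the shares are (nearly) equal in size.
    """
    rng = random.Random(seed)
    order = list(range(len(images)))
    rng.shuffle(order)   # random selection without replacement
    hybrid = []
    for group, apply_filter in enumerate(filters):
        # Every len(filters)-th shuffled index belongs to this group, so the
        # groups are disjoint and together cover the whole dataset.
        for idx in order[group::len(filters)]:
            hybrid.append((apply_filter(images[idx]), labels[idx]))
    return hybrid


# Stand-in filters that only tag the image; real smoothing functions go here.
tags = ("average", "median", "gaussian", "bilateral")
stand_ins = [lambda img, tag=tag: (tag, img) for tag in tags]
dataset = list(range(10))                 # toy "images"
emotions = [i % 7 for i in dataset]       # toy labels for seven classes
hybrid_set = build_hybrid_set(dataset, emotions, stand_ins)
```

Each image is filtered exactly once and keeps its label, so the hybrid set has the same size as the original dataset.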


2.3.1. Algorithm 
The stages in the algorithm illustrate the process of evaluating a face picture given as an input into an
emotion class. The algorithm uses three functions: hybrid filtering, randSelect, and FacEmoRec. The
hybrid filtering function filters pictures using the average, median, bilateral, and gaussian
methods. The randSelect function randomly picks the pictures from the original dataset on which each
filter should be applied, and FacEmoRec classifies the filtered pictures based on emotion. 


Algorithm 1 
Input: Image Set Consisting of Seven Different Facial Expressions 
Output: Hybrid Image Set After Applying Different Filtering Techniques 
HybridFilteringAlg(ImageSet[ ]) 
HybridImgSt[ ] = {∅} 
Begin 
    if len(ImageSet[ ]) ≠ 0 then 
    Beginif 
        for all the images in ImageSet 
            resize(ImageSet[ ], 48, 48)    // Resizes all images to 48×48 
            Normalize(ImageSet[ ])         // Normalizes all the images 
        l = len(ImageSet[ ]) 
        while (l > 0) 
        Beginloop 
            imgi[ ] = randSelect(ImageSet[ ], l) 
            imgfi[ ] = AvgFilter(imgi[ ], (3,3)) 
            HybridImgSt[ ] = HybridImgSt[ ] ∪ imgfi[ ] 
            ImageSet = ImageSet − imgi[ ] 
            l = l − len(imgi[ ]) 

            imgj[ ] = randSelect(ImageSet[ ], l) 
            imgfj[ ] = MedianFilter(imgj[ ], (3,3)) 
            HybridImgSt[ ] = HybridImgSt[ ] ∪ imgfj[ ] 
            ImageSet = ImageSet − imgj[ ] 
            l = l − len(imgj[ ]) 

            imgk[ ] = randSelect(ImageSet[ ], l) 
            imgfk[ ] = GaussianFilter(imgk[ ], (3,3)) 
            HybridImgSt[ ] = HybridImgSt[ ] ∪ imgfk[ ] 
            ImageSet = ImageSet − imgk[ ] 
            l = l − len(imgk[ ]) 

            imgfl[ ] = BilateralFilter(ImageSet[ ], (3,3))    // Remaining images 
            HybridImgSt[ ] = HybridImgSt[ ] ∪ imgfl[ ] 
            l = l − len(ImageSet[ ]) 
            ImageSet = ImageSet − ImageSet[ ] 
        Endloop 
    Endif 
End 

randSelect(ImageSet[ ], l) 
Begin_Function 
    do_loop 
        n = randInt( ) 
    while (n > l) 
    return (rand(Img[n])) 
End_Function 

Input: HybridImgSt Consisting of Images of Seven Different Facial Expressions After Filtering 
Output: Classification of Images Based on Expression Type 
FacEmoRec(HybridImgSt)    // Emotion Recognizing Model 
Begin 
    l = len(HybridImgSt) 
    for i in 1 to l 
    Begin 
        LabImgSt ← Label(HybridImgSt[imgi])    // Assigns labels to images 
        TrSt, TsSt ← Split(LabImgSt, 85, 15) 
        CNNModel ← CNNModel(TrSt) 
        Evaluation ← CNNModel(TsSt) 
    End 
    return Classified Emotion    // Classified emotions will be returned 
End 
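
The Split(LabImgSt, 85, 15) step of the algorithm can be sketched as a shuffled percentage split. The function name `split_dataset`, the fixed seed and the use of (image id, label) pairs are assumptions for illustration; the CNN itself is not sketched because its architecture is not specified in this section.

```python
import random


def split_dataset(samples, train_share=85, test_share=15, seed=0):
    """Shuffle labelled samples and split them by the given percentage shares."""
    if train_share + test_share != 100:
        raise ValueError("shares must add up to 100")
    rng = random.Random(seed)
    shuffled = list(samples)   # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    cut = len(shuffled) * train_share // 100
    return shuffled[:cut], shuffled[cut:]


# Toy labelled set of (image_id, emotion_label) pairs; 3540 is the CK48+ size.
labelled = [(i, i % 7) for i in range(3540)]
train_set, test_set = split_dataset(labelled)
```

For a set of 3540 samples this yields 3009 training and 531 test samples, and no sample appears in both parts.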


2.3.2. Flow chart for the proposed model 

Figure 1 describes the application of the different filters (average, median, gaussian, and bilateral) to
the image dataset, which consists of a finite number of images. The total number of images passed through the
different filters is equal to the total number of images in the actual dataset. The filtered images are applied
to the model for training and evaluation. 

Figure 1. Develop a model with a hybrid dataset whose performance is measured through a proper 
evaluation plan 


3. RESULTS AND DISCUSSION 

High computation capability in terms of graphical processing unit (GPU), central processing unit
(CPU) and memory is required to build a hybrid image filter algorithm and a CNN model for
evaluating the performance of the hybrid image filter dataset. We used the Google Colab cloud
service for developing the above-mentioned models. The configuration of the cloud service used is
as follows: 

CPU frequency: 2.30 GHz; GPU used: NVIDIA (12 GB); disk space supported:
25 GB; editor used: Jupyter Notebook. The CK48+ and FER2013 datasets, which consist of 3540 and 35887 images,
respectively, related to seven different facial expressions (happy, angry, sad, surprise, neutral, disgust, and fear),
are considered for experimentation. The average, median, gaussian, bilateral and proposed hybrid
filters are considered for filtering the datasets, and the resulting images are given to a CNN model for
evaluation. It is observed that the images given as inputs to the CNN model after
filtering produced higher test accuracy than the images to which filtering is not applied (Tables 1 and 2).
Figure 2 represents the accuracy and loss comparisons obtained from the model without filtering and with
the average and median filtering techniques applied to the CK48+ dataset. Figure 2(a) represents the train
and test loss comparison without filtering, Figure 2(b) the train and test loss comparison when average
filtering is applied, and Figure 2(c) the train and test loss comparison when median filtering is applied to
the CK48+ dataset. Figure 3 represents the accuracy and loss comparisons obtained from the model with the
gaussian, bilateral and proposed hybrid filtering techniques applied to the CK48+ dataset. Figure 3(a)
represents the train and test loss comparison for gaussian filtering, Figure 3(b) the train and test loss
comparison when bilateral filtering is applied, and Figure 3(c) the train and test loss comparison when
hybrid filtering is applied to the CK48+ dataset. 


[Figure 2 panel titles: "Actual Train_loss vs Actual Val_loss"; "Average Filter Train_loss vs Average Filter Val_loss"] 

Figure 2. Performance measures without filtering and after applying average and median filtering on the CK48+ 
dataset; (a) performance measures without applying filtering on CK48+ dataset, (b) performance measures 
after applying average filtering on CK48+ dataset and (c) performance measures after applying median 
filtering on CK48+ dataset 

Figure 3. Performance measures after applying gaussian, bilateral and hybrid filtering on CK48+ dataset; 
(a) performance measures after applying gaussian filtering on CK48+ dataset, (b) performance measures 
after applying bilateral filtering on CK48+ dataset and (c) performance measures after applying hybrid 
filtering on CK48+ dataset 


Figure 4 represents the accuracy and loss comparisons obtained from the model without
filtering and with the average and median filtering techniques applied to the FER2013 dataset. Figure 4(a)
represents the train and test loss comparison without filtering, Figure 4(b) the train and test loss
comparison when average filtering is applied, and Figure 4(c) the train and test loss comparison when
median filtering is applied to the FER2013 dataset. Figure 5 represents the accuracy and loss comparisons
obtained from the model with the gaussian, bilateral and proposed hybrid filtering techniques applied to the
FER2013 dataset. Figure 5(a) represents the train and test loss comparison for gaussian filtering, Figure 5(b)
the train and test loss comparison when bilateral filtering is applied, and Figure 5(c) the train and
test loss comparison when hybrid filtering is applied to the FER2013 dataset. Table 1 presents the
comparative performance analysis of train and test accuracy, loss, and time taken per epoch for the model
with filtering and without filtering, compared with the proposed hybrid filtering technique, on the
FER2013 dataset. 

Figure 4. Performance measures without filtering and after applying average and median filtering on the FER2013 
dataset; (a) performance measures without applying filtering on FER2013 dataset, (b) performance 
measures after applying average filtering on FER2013 dataset and (c) performance measures after applying 
median filtering on FER2013 dataset 

Table 2 presents the comparative performance analysis of train and test accuracy, loss, and time
taken per epoch for the model with filtering and without filtering, compared with the proposed
hybrid filtering technique, on the CK48+ dataset. Figure 6 is a bar chart of the accuracy levels
obtained from the model with filtering and without filtering, compared with the proposed hybrid filtering
technique, on the CK48+ and FER2013 datasets. Figure 7 is a bar chart of the loss levels obtained
from the model with filtering and without filtering, compared with the proposed hybrid filtering technique,
on the CK48+ and FER2013 datasets. 

Figure 5. Performance measures after applying gaussian, bilateral and hybrid filtering on FER2013 dataset; 
(a) performance measures after applying gaussian filtering on FER2013 dataset, (b) performance measures 
after applying bilateral filtering on FER2013 dataset and (c) performance measures after applying hybrid 
filtering on FER2013 dataset 

Table 1. FER2013 performance comparative analysis using various filters 

FER-2013 
S.No  Type of filtering applied  Train accuracy (%)  Test accuracy (%)  Train loss  Test loss  Exec time per epoch (s) 
1     Average Filter             86.2                61.89              0.383       1.201      17 
2     Median Filter              85.81               62.54              0.934       1.169      16 
3     Gaussian Filter            86.34               62.67              0.381       1.193      16 
4     Bilateral Filter           84.34               61.65              0.434       1.193      12 
5     Hybrid Filter              85.66               63.37              0.400       1.178      10 
6     Without Filtering          86.21               61.57              0.382       1.118      17 

Table 2. CK48+ performance comparative analysis using various filters 

CK48+ 
S.No  Type of filtering applied  Train accuracy (%)  Test accuracy (%)  Train loss  Test loss  Exec time per epoch (s) 
1     Average Filter             78.39               74.76              0.525       0.470      1 
2     Median Filter              77.3                77.59              0.464       0.474      2 
3     Gaussian Filter            77.37               75.71              0.467       0.491      1 
4     Bilateral Filter           THI                 74.95              0.473       0.478      1 
5     Hybrid Filter              79.59               78.72              0.438       0.491      1 
6     Without Filtering          65.2                69.68              0.701       0.605      1 
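
The comparison in Tables 1 and 2 can be checked with a little arithmetic. The sketch below only transcribes the test accuracies from the two tables; the dictionary layout and the function names are hypothetical.

```python
# Test accuracies (%) transcribed from Tables 1 and 2.
TEST_ACCURACY = {
    "FER2013": {"Average": 61.89, "Median": 62.54, "Gaussian": 62.67,
                "Bilateral": 61.65, "Hybrid": 63.37, "Without filtering": 61.57},
    "CK48+": {"Average": 74.76, "Median": 77.59, "Gaussian": 75.71,
              "Bilateral": 74.95, "Hybrid": 78.72, "Without filtering": 69.68},
}


def gain_over_unfiltered(dataset, method="Hybrid"):
    """Test-accuracy gain (percentage points) of a method over no filtering."""
    scores = TEST_ACCURACY[dataset]
    return round(scores[method] - scores["Without filtering"], 2)


def best_method(dataset):
    """Name of the method with the highest test accuracy on a dataset."""
    scores = TEST_ACCURACY[dataset]
    return max(scores, key=scores.get)


for name in TEST_ACCURACY:
    print(name, best_method(name), gain_over_unfiltered(name))
```

With the tabulated values, the hybrid filter has the highest test accuracy on both datasets, with gains over the unfiltered input of 1.80 percentage points on FER2013 and 9.04 percentage points on CK48+.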

Figure 6. Accuracy comparisons after applying various filters on CK48+ and FER2013 datasets 

Figure 7. Loss comparisons after applying various filters on CK48+ and FER2013 datasets 


4. CONCLUSION 

The research described in this article presents a robust deep convolutional neural network (CNN) model
for classifying facial emotion into one of seven classes. In the proposed model, the input is a mixture of
smoothed images produced by different smoothing filters. The model achieved reasonable performance in
terms of accuracy and loss on the test dataset when trained using CK48+ and FER2013 mixed smoothed images
(test accuracy of 78.72% on CK48+ and 63.37% on FER2013 with the hybrid filter). This work
can be extended to find the most suitable filter for each image, which may further increase the accuracy level. 